WHAT IS CLAIMED IS: 

1. An apparatus for generating a finite state transducer
for use in incremental parsing, comprising: 

a recursive transition network creating device that 
creates a recursive transition network, the recursive transition
network being a set of networks, each network representing a
set of grammar rules based on a context-free grammar by states
and arcs connecting the states, each arc having an input label
and an output label, each network having a recursive structure
where each transition labeled with a non-terminal symbol included
in each of the networks is defined by another network; 

an arc replacement device that replaces an arc having an 
input label representing a start symbol included in the finite 
state transducer in an initial state by a network corresponding 
to the input label of the arc in the recursive transition network
and further recursively repeats an arc replacement operation 
for replacing each arc, which is newly created from a replaced 
network, by another network in the recursive transition network; 
and 

a priority calculating device that calculates a derivation
probability to derive a node of a parse tree corresponding to
each of arcs whose input labels are non-terminal symbols in the
finite state transducer based on statistical information
regarding frequency of applying grammar rules and determines
an arc replacement priority in terms of an obtained derivation
probability;

wherein the arc replacement device continues applying the 
arc replacement operation to each arc included in the finite 
state transducer in descending order of the arc replacement 
priority until the finite state transducer reaches a 
predetermined size.
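
By way of illustration only, the priority-ordered arc replacement recited in claim 1 may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The function name `expand`, the dictionary layout of `RULES`, and the example grammar with its probabilities are all hypothetical. Output labels are omitted and each rule is expanded into a linear chain of arcs. The priority of a new arc is simplified to the product of rule probabilities along its derivation path, without conditioning on earlier rules.

```python
import heapq
from itertools import count

# Hypothetical toy grammar: non-terminal -> list of (rule probability, right side).
RULES = {
    "S": [(1.0, ["NP", "VP"])],
    "NP": [(0.7, ["n"]), (0.3, ["d", "n"])],
    "VP": [(1.0, ["v", "NP"])],
}

def expand(rules, start, max_arcs):
    """Replace non-terminal arcs in descending priority order.

    Returns a sorted list of arcs (source state, input label, target state).
    """
    fresh_state = count(2)           # states 0 and 1 are the initial and final states
    arc_ids = count(1)
    arcs = {0: (0, start, 1)}        # initial transducer: one arc labeled with the start symbol
    heap = [(-1.0, 0)]               # heapq is a min-heap, so priorities are stored negated
    while heap:
        neg_priority, arc_id = heapq.heappop(heap)
        src, label, dst = arcs[arc_id]
        added = sum(len(rhs) for _, rhs in rules[label])
        if len(arcs) - 1 + added > max_arcs:
            break                    # the next replacement would exceed the size bound
        del arcs[arc_id]
        for rule_prob, rhs in rules[label]:
            prev = src
            for k, symbol in enumerate(rhs):
                nxt = dst if k == len(rhs) - 1 else next(fresh_state)
                new_id = next(arc_ids)
                arcs[new_id] = (prev, symbol, nxt)
                if symbol in rules:  # newly created non-terminal arc: queue it for replacement
                    heapq.heappush(heap, (neg_priority * rule_prob, new_id))
                prev = nxt
    return sorted(arcs.values())

bounded = expand(RULES, "S", max_arcs=6)
```

With a bound of six arcs, this sketch stops while one arc labeled with the non-terminal NP is still unexpanded. Arcs left over in this way are the kind that the arc eliminating device of claim 2 would remove.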

2. The apparatus according to claim 1, further comprising 
an arc eliminating device that, after the application of the 
arc replacement operation by the arc replacement device 
terminates, eliminates arcs whose input labels are non-terminal 
symbols and further performs the arc replacement operation. 

3. The apparatus according to claim 1, wherein the 
derivation probability for a certain node represents a 
probability that grammar rules are applied in order to each node
on a path from a root node to the certain node in the parse tree.

4. The apparatus according to claim 3, wherein derivation
probability $P(X_{r_M(l_M)})$ for node $X_{r_M(l_M)}$ is determined as follows:

$$P(X_{r_M(l_M)}) = \prod_{i=1}^{M} P\bigl(r_i \mid r_{i-N+1}(l_{i-N+1}), \ldots, r_{i-1}(l_{i-1})\bigr)$$

wherein $r_i$ represents a grammar rule, $r_i(l_i)$ represents that
grammar rule $r_i$ is applied and grammar rule $r_{i+1}$ to be applied
next is applied to a node generated by the $l_i$-th element of
the right side of $r_i$, and $N$ is a predetermined positive integer.
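
By way of illustration only, a derivation probability of the kind recited in claim 4, namely a product of rule probabilities each conditioned on at most the N-1 preceding rule applications, may be sketched as follows. This is a minimal sketch under stated assumptions: the function `derivation_probability`, the lookup table `TABLE`, the rule strings and the probability values are all hypothetical and are not taken from the specification.

```python
from math import prod

def derivation_probability(path, cond_prob, n):
    """Multiply conditional rule probabilities along a root-to-node path.

    path: [(r_1, l_1), ..., (r_M, l_M)], where l_i is the position on the
    right side of r_i at which the next rule is applied.
    cond_prob(rule, context) returns P(rule | context), where context is a
    tuple of up to n-1 preceding (rule, position) pairs.
    """
    factors = []
    for i, (rule, _position) in enumerate(path):
        context = tuple(path[max(0, i - (n - 1)):i])
        factors.append(cond_prob(rule, context))
    return prod(factors)

# Hypothetical conditional probabilities for N = 2.
TABLE = {
    ("S -> NP VP", ()): 1.0,
    ("NP -> d n", (("S -> NP VP", 1),)): 0.25,
}

# Node generated by the 2nd element of "NP -> d n", which was applied to the
# node generated by the 1st element of "S -> NP VP".
path = [("S -> NP VP", 1), ("NP -> d n", 2)]
p = derivation_probability(path, lambda r, ctx: TABLE[(r, ctx)], n=2)
```

In this sketch, setting `n=1` makes every context empty, so the product reduces to unconditioned rule probabilities.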

5. A computer-readable recording medium storing a program
for generating a finite state transducer for use in incremental 
parsing, the program comprising: 

a recursive transition network creating routine that
creates a recursive transition network, the recursive transition
network being a set of networks, each network representing a
set of grammar rules based on a context-free grammar by states
and arcs connecting the states, each arc having an input label
and an output label, each network having a recursive structure
where each transition labeled with a non-terminal symbol included
in each of the networks is defined by another network;

an arc replacement routine that replaces an arc having 
an input label representing a start symbol included in the finite 
state transducer in an initial state by a network corresponding
to the input label of the arc in the recursive transition network
and further recursively repeats an arc replacement operation 
for replacing each arc, which is newly created from a replaced 
network, by another network in the recursive transition network; 
and 

a priority calculating routine that calculates a
derivation probability to derive a node of a parse tree
corresponding to each of arcs whose input labels are non-terminal
symbols in the finite state transducer based on statistical
information regarding frequency of applying grammar rules and
determines an arc replacement priority in terms of an obtained
derivation probability;

wherein the arc replacement routine continues applying 
the arc replacement operation to each arc included in the finite 
state transducer in descending order of the arc replacement 
priority until the finite state transducer reaches a
predetermined size.



6. The computer-readable recording medium according to
claim 5, the program further comprising an arc eliminating
routine that, after the application of the arc replacement
operation by the arc replacement routine terminates, eliminates 
arcs whose input labels are non-terminal symbols and further 
performs the arc replacement operation. 

7. The computer-readable recording medium according to
claim 5, wherein, in the program, the derivation probability
for a certain node represents a probability that grammar rules
are applied in order to each node on a path from a root node
to the certain node in the parse tree.

8. The computer-readable recording medium according to
claim 7, wherein derivation probability $P(X_{r_M(l_M)})$ for node $X_{r_M(l_M)}$
is determined as follows:

$$P(X_{r_M(l_M)}) = \prod_{i=1}^{M} P\bigl(r_i \mid r_{i-N+1}(l_{i-N+1}), \ldots, r_{i-1}(l_{i-1})\bigr)$$

wherein $r_i$ represents a grammar rule, $r_i(l_i)$ represents that
grammar rule $r_i$ is applied and grammar rule $r_{i+1}$ to be applied
next is applied to a node generated by the $l_i$-th element of
the right side of $r_i$, and $N$ is a predetermined positive integer.

9. A method for generating a finite state transducer for 
use in incremental parsing comprising the steps of: 

creating a recursive transition network, the recursive 
transition network being a set of networks, each network 
representing a set of grammar rules based on a context-free 
grammar by states and arcs connecting the states, each arc having 
an input label and an output label, each network having a recursive 
structure where each transition labeled with a non-terminal 
symbol included in each of the networks is defined by another 
network; 

replacing an arc having an input label representing a start
symbol included in the finite state transducer in an initial
state by a network corresponding to the input label of the arc
in the recursive transition network and further recursively
repeating an arc replacement operation for replacing each arc,
which is newly created from a replaced network, by another network
in the recursive transition network; and 

calculating a derivation probability to derive a node of
a parse tree corresponding to each of arcs whose input labels
are non-terminal symbols in the finite state transducer based
on statistical information regarding frequency of applying
grammar rules and determining an arc replacement priority in terms
of an obtained derivation probability;

wherein, in the step of replacing an arc, the arc replacement
operation continues to be applied to each arc included in
the finite state transducer in descending order of the arc
replacement priority until the finite state transducer reaches
a predetermined size.

10. The method according to claim 9, further comprising
the step of eliminating arcs whose input labels are non-terminal
symbols and further performing the arc replacement operation, after
the application of the arc replacement operation in the step of
replacing an arc terminates.

11. The method according to claim 9, wherein the derivation
probability for a certain node represents a probability that
grammar rules are applied in order to each node on a path from
a root node to the certain node in the parse tree.

12. The method according to claim 11, wherein derivation
probability $P(X_{r_M(l_M)})$ for node $X_{r_M(l_M)}$ is determined as follows:

$$P(X_{r_M(l_M)}) = \prod_{i=1}^{M} P\bigl(r_i \mid r_{i-N+1}(l_{i-N+1}), \ldots, r_{i-1}(l_{i-1})\bigr)$$

wherein $r_i$ represents a grammar rule, $r_i(l_i)$ represents that
grammar rule $r_i$ is applied and grammar rule $r_{i+1}$ to be applied
next is applied to a node generated by the $l_i$-th element of
the right side of $r_i$, and $N$ is a predetermined positive integer.

13. An apparatus for incremental parsing, comprising: 
a finite state transducer generated by the method according
to claim 7, the finite state transducer outputting one or more
pieces of a parse tree as a result of a state transition when
each word is inputted thereto; and

a connecting device that sequentially connects each piece
of the parse tree outputted by the finite state transducer.
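
By way of illustration only, the word-by-word operation recited in claim 13 may be sketched as follows. This is a minimal sketch under stated assumptions: the transducer is modeled as a hypothetical deterministic table `TRANSITIONS` that outputs exactly one bracketed piece per word, whereas the claim allows one or more pieces, and the "connecting" step is modeled as plain string concatenation. The function name, states, words and bracket notation are all illustrative.

```python
# Hypothetical transducer: (state, input word) -> (next state, output piece of the parse tree).
TRANSITIONS = {
    (0, "the"): (1, "(S (NP (d the)"),
    (1, "dog"): (2, " (n dog))"),
    (2, "barks"): (3, " (VP (v barks)))"),
}

def parse_incrementally(transitions, words, start=0):
    """Yield the partial parse tree built so far after each input word."""
    state, tree = start, ""
    for word in words:
        state, piece = transitions[(state, word)]
        tree += piece              # connect the new piece to the pieces output so far
        yield tree

partials = list(parse_incrementally(TRANSITIONS, ["the", "dog", "barks"]))
```

Each element of `partials` is the partial tree available immediately after the corresponding word, which is the sense in which the parsing is incremental.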
